Optimising metadata workflows in a distributed information environment
The different purposes present within a distributed information environment create the potential for repositories to enhance their metadata by capitalising on the diversity of metadata available for any given object. This paper presents three conceptual reference models required to achieve this optimisation of metadata workflow: the ecology of repositories, the object lifecycle model, and the metadata lifecycle model. It suggests a methodology for developing the metadata lifecycle model, and illustrates how it might be used to enhance metadata within a network of repositories and services.
On the entropy of protein families
Proteins are essential components of living systems, capable of performing a
huge variety of tasks at the molecular level, such as recognition, signalling,
copy, transport, ... The protein sequences realizing a given function may
largely vary across organisms, giving rise to a protein family. Here, we
estimate the entropy of those families based on different approaches, including
Hidden Markov Models used for protein databases and inferred statistical models
reproducing the low-order (1- and 2-point) statistics of multi-sequence
alignments. We also compute the entropic cost, that is, the loss in entropy
resulting from a constraint acting on the protein, such as the fixation of one
particular amino-acid on a specific site, and relate this notion to the escape
probability of the HIV virus. The case of lattice proteins, for which the
entropy can be computed exactly, allows us to provide another illustration of
the concept of cost, due to the competition of different folds. The relevance
of the entropy in relation to directed evolution experiments is stressed. Comment: to appear in Journal of Statistical Physics
Identification of drug resistance mutations in HIV from constraints on natural evolution
Human immunodeficiency virus (HIV) evolves with extraordinary rapidity.
However, its evolution is constrained by interactions between mutations in its
fitness landscape. Here we show that an Ising model describing these
interactions, inferred from sequence data obtained prior to the use of
antiretroviral drugs, can be used to identify clinically significant sites of
resistance mutations. Successful predictions of the resistance sites indicate
progress in the development of successful models of real viral evolution at the
single residue level, and suggest that our approach may be applied to help
design new therapies that are less prone to failure even where resistance data
is not yet available. Comment: 5 pages, 3 figures